Statistical Computing Software Reviews Multiple Imputation in Practice: Comparison of Software Packages for Regression Models With Missing Variables
Abstract
Missing data frequently complicates data analysis for scientific investigations. The development of statistical methods to address missing data has been an active area of research in recent decades. Multiple imputation, originally proposed by Rubin in a public use dataset setting, is a general purpose method for analyzing datasets with missing data that is broadly applicable to a variety of missing data settings. We review multiple imputation as an analytic strategy for missing data. We describe and evaluate a number of software packages that implement this procedure, and contrast the interface, features, and results. We compare the packages, and detail shortcomings and useful features. The comparisons are illustrated using examples from an artificial dataset and a study of child psychopathology. We suggest additional features as well as discuss limitations and cautions to consider when using multiple imputation as an analytic strategy for incomplete data settings.
Related articles
Selection of Variables that Influence Drug Injection in Prison: Comparison of Methods with Multiple Imputed Data Sets
Background: Prisoners, compared to the general population, are at greater risk of infection. Drug injection is the main route of HIV transmission, in particular in Iran. What would be of interest is to determine variables that govern drug injection among prisoners. However, one of the issues that challenge model building is incomplete national data sets. In this paper, we addressed the process ...
Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation
We propose a remedy for the discrepancy between the way political scientists analyze data with missing values and the recommendations of the statistics community. Methodologists and statisticians agree that “multiple imputation” is a superior approach to the problem of missing data scattered through one’s explanatory and dependent variables than the methods currently used in applied data analy...
Imputation Methods for Handling Item-Nonresponse in the Social Sciences: A Methodological Review
Missing data are often a problem in social science data. Imputation methods fill in the missing responses and lead, under certain conditions, to valid inference. This article reviews several imputation methods used in the social sciences and discusses advantages and disadvantages of these methods in practice. Simpler imputation methods as well as more advanced methods, such as fractional and mu...
Estimating interaction effects with incomplete predictor variables.
The existing missing data literature does not provide a clear prescription for estimating interaction effects with missing data, particularly when the interaction involves a pair of continuous variables. In this article, we describe maximum likelihood and multiple imputation procedures for this common analysis problem. We outline 3 latent variable model specifications for interaction analyses w...
What do we do with missing data? Some options for analysis of incomplete data.
Missing data are a pervasive problem in many public health investigations. The standard approach is to restrict the analysis to subjects with complete data on the variables involved in the analysis. Estimates from such analysis can be biased, especially if the subjects who are included in the analysis are systematically different from those who were excluded in terms of one or more key variable...